docs: socket-analysis guide — RTT bands & clustering for data teams #42

Merged
randomizedcoder merged 1 commit into docs/parquet-format from docs/socket-analysis on Jun 20, 2026

Conversation

@randomizedcoder

Owner

Adds docs/socket-analysis.md — the analysis companion to the Parquet docs, for a data/analytics team that wants to find the natural RTT bands (and other socket groupings) in the fleet statistically.

Stacked on #41 (docs/parquet-format) so the cross-links resolve. Merge order: #39 → #41 → this; each retargets to main as its base merges.

What it covers

  • The RTT-band mental model — intra-DC / metro-CDN / regional / outliers / mobile — framed as hypotheses the data must confirm, that drift over time and differ per DC.
  • Right signal — min_rtt (not srtt) on a log scale.
  • Data prep — the critical "aggregate to one row per socket, not per poll" step (key on socket_cookie+hostname+netns), filtering (ESTABLISHED, loopback, survivorship), unit conversions, cumulative-counter handling, deriving the DC dimension.
  • Finding RTT bands — GMM+BIC (recommended, adaptive/drift-aware), Jenks/KDE (simple), Snowflake NTILE/APPROX_PERCENTILE (quick win, with the quantiles≠modes caveat); labeling/validating against dest_asn/geo/port; tracking centroids per (DC, day).
  • Multi-feature clustering — {log min_rtt, log throughput, retrans_rate, rel jitter, log cwnd} via K-means/GMM/HDBSCAN (recommended — finds the outlier band as noise); validation.
  • Other analyses — throughput bands, loss bands, congestion-algo comparison, per-ASN/CDN, diurnal, drift/anomaly.
  • Worked example — DuckDB/Snowflake feature SQL → Python GMM+BIC and HDBSCAN → Snowflake quick bands.
  • Pitfalls — µs units, per-socket grain, cumulative counters, survivorship, app-limited throughput, raw-byte addresses, band drift, per-host clocks.
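
The "one row per socket, not per poll" step from the Data prep bullet can be sketched as below. This is a minimal stdlib-only illustration, not the SQL from the guide. The `state` and `bytes_acked` field names and the sample rows are hypothetical, and it assumes `min_rtt` is reported in microseconds. Loopback and survivorship filtering are left out for brevity.

```python
import math

# Hypothetical poll rows, one dict per (socket, poll). Only socket_cookie,
# hostname, netns and min_rtt are names taken from the PR description;
# "state" and "bytes_acked" are illustrative assumptions.
polls = [
    {"socket_cookie": 101, "hostname": "web-1", "netns": 0,
     "state": "ESTABLISHED", "min_rtt": 250, "bytes_acked": 1000},
    {"socket_cookie": 101, "hostname": "web-1", "netns": 0,
     "state": "ESTABLISHED", "min_rtt": 200, "bytes_acked": 2000},
    {"socket_cookie": 101, "hostname": "web-1", "netns": 0,
     "state": "ESTABLISHED", "min_rtt": 220, "bytes_acked": 3000},
    {"socket_cookie": 202, "hostname": "web-1", "netns": 0,
     "state": "ESTABLISHED", "min_rtt": 31000, "bytes_acked": 500},
    {"socket_cookie": 303, "hostname": "web-2", "netns": 0,
     "state": "TIME_WAIT", "min_rtt": 900, "bytes_acked": 10},
]


def per_socket(rows):
    """Collapse poll rows to one feature row per socket."""
    out = {}
    for r in rows:
        # Keep established sockets with a usable RTT sample only.
        if r["state"] != "ESTABLISHED" or r["min_rtt"] <= 0:
            continue
        key = (r["socket_cookie"], r["hostname"], r["netns"])
        agg = out.setdefault(
            key, {"min_rtt_us": r["min_rtt"], "bytes_acked": 0, "polls": 0}
        )
        # min_rtt: keep the smallest value seen across polls.
        agg["min_rtt_us"] = min(agg["min_rtt_us"], r["min_rtt"])
        # Cumulative counter: keep the largest (final) value, never sum polls.
        agg["bytes_acked"] = max(agg["bytes_acked"], r["bytes_acked"])
        agg["polls"] += 1
    for agg in out.values():
        # Log scale, so bands that differ by orders of magnitude stay comparable.
        agg["log10_min_rtt_us"] = math.log10(agg["min_rtt_us"])
    return out


sockets = per_socket(polls)
for key, features in sockets.items():
    print(key, features)
```

Summing `bytes_acked` across polls would count the same bytes several times, which is why the sketch keeps the maximum. Without the grouping step, a long-lived socket polled many times would outweigh short-lived ones in any later band fit.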

Notes

  • Every xtcp column named in the doc was verified to exist in the schema; code snippets are illustrative/adaptable (not run in CI). Soft-wrapped to match the repo convention; links/anchors verified.
  • Cross-linked from the docs hub and parquet-format.md.
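
To make the GMM+BIC band-finding idea listed above concrete, here is a small pure-Python sketch: fit 1-D Gaussian mixtures to log10(min_rtt) for several component counts with EM, then keep the count with the lowest BIC. It is not the code from the guide. The data is synthetic (two made-up bands near 200 µs and 30 ms), and `fit_gmm_1d`, `synthetic_band` and the variance floor are hypothetical choices for illustration. In practice a library implementation such as scikit-learn's `GaussianMixture` would likely replace the hand-written EM loop.

```python
import math
from statistics import NormalDist


def log_terms(x, weights, means, variances):
    """Per-component log(weight * Normal(x | mean, variance))."""
    return [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm_1d(xs, k, iters=100, var_floor=1e-4):
    """Fit a k-component 1-D Gaussian mixture with EM and report its BIC."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    variances = [max(sum((x - centre) ** 2 for x in xs) / n, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            terms = log_terms(x, weights, means, variances)
            norm = log_sum_exp(terms)
            resp.append([math.exp(t - norm) for t in terms])
        # M-step: re-estimate weight, mean and variance per component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    loglik = sum(log_sum_exp(log_terms(x, weights, means, variances)) for x in xs)
    # Free parameters in 1-D: k means + k variances + (k - 1) weights.
    bic = (3 * k - 1) * math.log(n) - 2 * loglik
    return {"k": k, "weights": weights, "means": means, "bic": bic}


def synthetic_band(mean_log10_us, sigma, count):
    """Evenly spaced Gaussian quantiles: synthetic data, not fleet measurements."""
    dist = NormalDist(mean_log10_us, sigma)
    return [dist.inv_cdf((i + 0.5) / count) for i in range(count)]


# Two made-up bands on the log10(microseconds) scale.
log_rtt = synthetic_band(2.3, 0.15, 150) + synthetic_band(4.48, 0.15, 150)

fits = [fit_gmm_1d(log_rtt, k) for k in range(1, 5)]
best = min(fits, key=lambda f: f["bic"])
band_centres_us = sorted(10 ** m for m in best["means"])

for f in fits:
    print(f"k={f['k']}  BIC={f['bic']:.1f}")
print("chosen k:", best["k"], "band centres (us):", [round(c) for c in band_centres_us])
```

Because the component count is chosen by BIC rather than fixed in advance, refitting per (DC, day) lets the number and position of bands change as the fleet drifts, which is the adaptive behaviour the description attributes to this approach.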

🤖 Generated with Claude Code

New docs/socket-analysis.md: a methodology guide for finding the natural RTT
bands statistically (min_rtt on a log scale; GMM+BIC for adaptive, drift-aware
bands; Jenks/KDE simple alternative; Snowflake quantile quick-win), with
labeling/validation against dest ASN/geo and per-DC/over-time tracking. Adds
multi-feature clustering (HDBSCAN) and other groupings (throughput, loss,
congestion algo, per-ASN, diurnal), a worked SQL→Python example, and a
pitfalls section (per-socket grain, cumulative counters, µs units, survivorship,
app-limited throughput, drift). Cross-linked from the docs hub and parquet doc.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>